Statistical and Machine Learning

Lab4: Classification
Linear Discriminant Analysis

Tsai, Dai-Rong

Dataset

Edgar Anderson’s Iris Data

# Set random seed
set.seed(12345)

# Packages
library(MASS) # for lda, qda
  • Response
    • Species: setosa / versicolor / virginica
  • Predictors
    • Sepal.Length
    • Sepal.Width
    • Petal.Length
    • Petal.Width

dim(iris)
[1] 150   5
head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
proportions(table(iris$Species))

    setosa versicolor  virginica 
   0.33333    0.33333    0.33333 
str(iris)
'data.frame':   150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

Create Training/Testing Partitions

  • Split data into 80% training set and 20% test set
nr <- nrow(iris)
train.id <- sample(nr, nr * 0.8)

training <- iris[train.id, ]
testing <- iris[-train.id, ]
  • Check dimension
dim(training)
[1] 120   5
dim(testing)
[1] 30  5

Linear Discriminant Analysis (LDA)

lda.mod <- lda(Species ~ ., training, prior = c(1,1,1)/3)

If prior is unspecified, the class proportions for the training set are used. If present, the probabilities should be specified in the order of the factor levels.

Components of an lda object

  • prior: the prior probabilities used.
  • means: the group means.
  • scaling : a matrix which transforms observations to discriminant functions, normalized so that within groups covariance matrix is spherical.
  • svd: the singular values, which give the ratio of the between- and within-group standard deviations on the linear discriminant variables. Their squares are the canonical F-statistics.
lda.mod
Call:
lda(Species ~ ., data = training, prior = c(1, 1, 1)/3)

Prior probabilities of groups:
    setosa versicolor  virginica 
   0.33333    0.33333    0.33333 

Group means:
           Sepal.Length Sepal.Width Petal.Length Petal.Width
setosa           5.0075      3.4075       1.4475      0.2325
versicolor       5.9500      2.8100       4.2775      1.3425
virginica        6.6325      2.9900       5.6025      2.0400

Coefficients of linear discriminants:
                  LD1      LD2
Sepal.Length  0.77233 -0.26786
Sepal.Width   1.43010 -2.12129
Petal.Length -2.01887  0.92191
Petal.Width  -3.08463 -2.50828

Proportion of trace:
  LD1   LD2 
0.993 0.007 
plot(lda.mod, col = as.integer(training$Species)+1, cex = 0.5)

  • Retrieve the data used to fit the lda model
lda.fit <- predict(lda.mod)
lda.fit
$class
 [1] virginica  versicolor versicolor versicolor versicolor versicolor setosa     versicolor
 [9] virginica  setosa     virginica  versicolor setosa     setosa     virginica  setosa    
[17] setosa     versicolor setosa     setosa    
 [ reached getOption("max.print") -- omitted 100 entries ]
Levels: setosa versicolor virginica

$posterior
        setosa versicolor  virginica
142 5.1938e-36 3.3938e-04 9.9966e-01
51  7.7151e-18 9.9990e-01 1.0203e-04
58  2.3734e-13 1.0000e+00 6.3224e-08
93  2.2139e-17 9.9999e-01 7.7603e-06
75  3.4610e-17 9.9998e-01 1.9679e-05
96  8.7875e-17 9.9999e-01 1.0777e-05
 [ reached getOption("max.print") -- omitted 114 rows ]

$x
         LD1       LD2
142 -5.20625 -1.868888
51  -1.40229 -0.219119
58  -0.10799  1.753058
93  -1.15701  1.231407
75  -1.17871  0.460048
96  -1.06598  0.594060
2    7.05363  0.735270
86  -2.10175 -1.061552
146 -5.70561 -1.510995
38   8.22015 -0.286676
 [ reached getOption("max.print") -- omitted 110 rows ]
ldahist(lda.fit$x[, 1], g = training$Species, type = "both"); title("LD1", cex.main = 2)
ldahist(lda.fit$x[, 2], g = training$Species, type = "both"); title("LD2", cex.main = 2)

  • Computation of Linear Discriminants (LDs)
(Z <- lda.mod$scaling)
                  LD1      LD2
Sepal.Length  0.77233 -0.26786
Sepal.Width   1.43010 -2.12129
Petal.Length -2.01887  0.92191
Petal.Width  -3.08463 -2.50828
X <- model.matrix(Species ~ ., training)[, -1]
center <- t(lda.mod$prior) %*% lda.mod$means
Xc <- scale(X, center = center, scale = FALSE)
LD <- Xc %*% Z
LD
         LD1       LD2
142 -5.20625 -1.868888
51  -1.40229 -0.219119
58  -0.10799  1.753058
93  -1.15701  1.231407
75  -1.17871  0.460048
96  -1.06598  0.594060
2    7.05363  0.735270
86  -2.10175 -1.061552
146 -5.70561 -1.510995
38   8.22015 -0.286676
 [ reached getOption("max.print") -- omitted 110 rows ]
all.equal(LD, lda.fit$x)
[1] TRUE

Quadratic Discriminant Analysis (QDA)

qda.mod <- qda(Species ~ ., training, prior = c(1,1,1)/3)
qda.mod
Call:
qda(Species ~ ., data = training, prior = c(1, 1, 1)/3)

Prior probabilities of groups:
    setosa versicolor  virginica 
   0.33333    0.33333    0.33333 

Group means:
           Sepal.Length Sepal.Width Petal.Length Petal.Width
setosa           5.0075      3.4075       1.4475      0.2325
versicolor       5.9500      2.8100       4.2775      1.3425
virginica        6.6325      2.9900       5.6025      2.0400

Decision Boundaries

codes for plot
library(ggplot2)

train.2d <- data.frame(lda.fit$x, Species = training$Species)
lda.2d.mod <- lda(Species ~ LD1 + LD2, train.2d)
qda.2d.mod <- qda(Species ~ LD1 + LD2, train.2d)

grid.2d <- expand.grid(lapply(train.2d[1:2], \(x) seq(min(x), max(x), length.out = 1000)))
lda.class <- as.integer(predict(lda.2d.mod, newdata = grid.2d)$class)
qda.class <- as.integer(predict(qda.2d.mod, newdata = grid.2d)$class)
grid.2d.pred <- cbind(grid.2d, lda.class, qda.class)

ggplot(train.2d, aes(x = LD1, y = LD2)) +
  geom_point(aes(fill = Species, shape = Species)) +
  geom_contour(aes(z = lda.class, colour = "LDA"), grid.2d.pred, breaks = c(1.5, 2.5)) +
  geom_contour(aes(z = qda.class, colour = "QDA"), grid.2d.pred, breaks = c(1.5, 2.5)) +
  scale_shape_manual(values = 22:24) +
  scale_colour_discrete(name = "Classifiers") +
  theme_bw()

Prediction

lda.pred <- predict(lda.mod, testing)
qda.pred <- predict(qda.mod, testing)
  • LDA
table(Reality = testing$Species, Prediction = lda.pred$class)
            Prediction
Reality      setosa versicolor virginica
  setosa         10          0         0
  versicolor      0         10         0
  virginica       0          1         9
mean(testing$Species == lda.pred$class)
[1] 0.96667
  • QDA
table(Reality = testing$Species, Prediction = qda.pred$class)
            Prediction
Reality      setosa versicolor virginica
  setosa         10          0         0
  versicolor      0          9         1
  virginica       0          1         9
mean(testing$Species == qda.pred$class)
[1] 0.93333